🎯 CPU Dispatch
Specific
SIMD Selection, Runtime Detection, Feature Flags, Performance
Scoured 159878 posts in 14.4 ms
Iteratively optimizing an SPSC queue
🎯 Ring Buffers · blog.c21-mac.com · 4d · r/cpp
Beating Python’s GIL: Achieving a 130x Speedup in Batch Processing with Rust and Rayon
🦀 MIR Optimization · medium.com · 2d
facebookincubator/dispenso: The project provides high-performance concurrency, enabling highly parallel computation.
⏱️ Async Runtimes · github.com · 21h · Hacker News
'Performance without compromise': AMD debuts first dual 3D V-Cache Ryzen CPU in potential showdown against Threadripper and EPYC siblings
🧠 Memory Hierarchy · techradar.com · 2d
Accelerate CPU-based AI inference workloads using Intel AMX on Amazon EC2
🗺️ Region Inference · aws.amazon.com · 3d
Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0
🗺️ Region Inference · newsroom.intel.com · 1d
Metal Quantized Attention: pulling M5 Max ahead with Int8 matrix multiplication
🗺️ Region Inference · releases.drawthings.ai · 1d · Hacker News
Using GPT-4o-mini for Simple Tasks and GPT-4o for Complex Ones
⏲️ Embedded GC · kalibr.systems · 3d · DEV
Supercharging Redpanda Streaming with profile-guided optimization
📈 Performance Tools · redpanda.com · 1d
MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX
🔬 Nanopasses · danielvegamyhre.github.io · 5d · Hacker News
Stack vs malloc: real-world benchmark shows 2–6x difference
📚 Stack Data Structures · blog.stackademic.com · 1d · DEV
We test Intel's secret sauce for Windows 11 performance in its latest Core Ultra processors
📊 Profiling Tools · neowin.net · 4d
Scaling AI Workloads in Java Without Breaking Your APIs
⚡ Interpreter Optimization · dzone.com · 6d
APL Performance
🔀 SIMD Programming · aplwiki.com · 3d · Hacker News
Why I’m Building a Database Engine in C#
🗃️ Query Compilation · nockawa.github.io · 6d · Hacker News
[Benchmark] 740k QPS Single-thread / 1.45M Dual-thread on a VM. Encountering fluctuations and seeking expert analysis.
🌐 WASM Runtimes · github.com · 1d · r/java
Systematic Analysis of CPU-Induced Slowdowns in Multi-GPU LLM Inference (Georgia Tech)
🗺️ Region Inference · semiengineering.com · 6d
Stop Choosing: Get C++ Performance in Python Algos with C++26 -- Richard Hickling
🎭 Racket Modules · isocpp.org · 6d
m0at/rvllm: rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.
🦀 MIR Optimization · github.com · 5d · Hacker News
btursunbayev/nvsonar: Active GPU diagnostic tool that identifies performance bottlenecks, detects anomalous patterns, and gives actionable recommendations
📊 Profiling Tools · github.com · 3d · Hacker News